file system
- Europe > Italy > Tuscany > Florence (0.04)
- North America > United States > North Dakota > Bowman County (0.04)
- Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
- (3 more...)
The AI_INFN Platform: Artificial Intelligence Development in the Cloud
Anderlini, Lucio, Bianchini, Giulio, Ciangottini, Diego, Pra, Stefano Dal, Michelotto, Diego, Petrini, Rosa, Spiga, Daniele
Machine Learning (ML) is profoundly reshaping the way researchers create, implement, and operate data-intensive software. Its adoption, however, introduces notable challenges for computing infrastructures, particularly when it comes to coordinating access to hardware accelerators across development, testing, and production environments. The INFN initiative AI_INFN (Artificial Intelligence at INFN) seeks to promote the use of ML methods across various INFN research scenarios by offering comprehensive technical support, including access to AI-focused computational resources. Leveraging the INFN Cloud ecosystem and cloud-native technologies, the project emphasizes efficient sharing of accelerator hardware while maintaining the breadth of the Institute's research activities. This contribution describes the deployment and commissioning of a Kubernetes-based platform designed to simplify GPU-powered data analysis workflows and enable their scalable execution on heterogeneous distributed resources. By integrating offload-ing mechanisms through Virtual Kubelet and the InterLink API, the platform allows workflows to span multiple resource providers, from Worldwide LHC Computing Grid sites to high-performance computing centers like CINECA Leonardo. We will present preliminary benchmarks, functional tests, and case studies, demonstrating both performance and integration outcomes.
- North America > United States (0.04)
- Europe > Italy > Umbria > Perugia Province > Perugia (0.04)
- Europe > Italy > Sardinia > Cagliari (0.04)
- Europe > Italy > Emilia-Romagna > Metropolitan City of Bologna > Bologna (0.04)
- Europe > Italy > Tuscany > Florence (0.04)
- North America > United States > North Dakota > Bowman County (0.04)
- Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
- (3 more...)
Supporting the development of Machine Learning for fundamental science in a federated Cloud with the AI_INFN platform
Anderlini, Lucio, Barbetti, Matteo, Bianchini, Giulio, Ciangottini, Diego, Pra, Stefano Dal, Michelotto, Diego, Pellegrino, Carmelo, Petrini, Rosa, Pascolini, Alessandro, Spiga, Daniele
Machine Learning (ML) is driving a revolution in the way scientists design, develop, and deploy data-intensive software. However, the adoption of ML presents new challenges for the computing infrastructure, particularly in terms of provisioning and orchestrating access to hardware accelerators for development, testing, and production. The INFN-funded project AI_INFN ("Artificial Intelligence at INFN") aims at fostering the adoption of ML techniques within INFN use cases by providing support on multiple aspects, including the provision of AI-tailored computing resources. It leverages cloud-native solutions in the context of INFN Cloud, to share hardware accelerators as e ffec-tively as possible, ensuring the diversity of the Institute's research activities is not compromised. In this contribution, we provide an update on the commissioning of a Kubernetes platform designed to ease the development of GPU-powered data analysis workflows and their scalability on heterogeneous, distributed computing resources, possibly federated as Virtual Kubelets with the interLink provider.
- Europe > Italy > Emilia-Romagna > Metropolitan City of Bologna > Bologna (0.05)
- Europe > Italy > Umbria > Perugia Province > Perugia (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- Information Technology > Hardware (1.00)
- Information Technology > Cloud Computing (1.00)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
Query-based versus resource-based cache strategies in tag-based browsing systems
Gayoso-Cabada, Joaquín, Gómez-Albarrán, Mercedes, Sierra, José-Luis
Tag-based browsing is a popular interaction model for navigating digital libraries. According to this model, users select descriptive tags to filter resources in the collections. Typical implementations of the model are based on inverted indexes. However, these implementations can require a considerable amount of set operations to update the browsing state. To palliate this inconven-ience, it is possible to adopt suitable cache strategies. In this paper we describe and compare two of these strategies: (i) a query-based strategy, according to which previously computed browsing states are indexed by sets of selected tags; and (ii) a resource-based strategy, according to which browsing states are in-dexed by sets of filtered resources. Our comparison focused on runtime perfor-mance, and was carried out empirically, using a real-world web-based collec-tion in the field of digital humanities. The results obtained show that the re-source-based strategy clearly outperforms the query-based one.
On the Generalizability of Machine Learning-based Ransomware Detection in Block Storage
Reategui, Nicolas, Pletka, Roman, Diamantopoulos, Dionysios
Ransomware represents a pervasive threat, traditionally countered at the operating system, file-system, or network levels. However, these approaches often introduce significant overhead and remain susceptible to circumvention by attackers. Recent research activity started looking into the detection of ransomware by observing block IO operations. However, this approach exhibits significant detection challenges. Recognizing these limitations, our research pivots towards enabling robust ransomware detection in storage systems keeping in mind their limited computational resources available. To perform our studies, we propose a kernel-based framework capable of efficiently extracting and analyzing IO operations to identify ransomware activity. The framework can be adopted to storage systems using computational storage devices to improve security and fully hide detection overheads. Our method employs a refined set of computationally light features optimized for ML models to accurately discern malicious from benign activities. Using this lightweight approach, we study a wide range of generalizability aspects and analyze the performance of these models across a large space of setups and configurations covering a wide range of realistic real-world scenarios. We reveal various trade-offs and provide strong arguments for the generalizability of storage-based detection of ransomware and show that our approach outperforms currently available ML-based ransomware detection in storage. Empirical validation reveals that our decision tree-based models achieve remarkable effectiveness, evidenced by higher median F1 scores of up to 12.8%, lower false negative rates of up to 10.9% and particularly decreased false positive rates of up to 17.1% compared to existing storage-based detection approaches.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > California > Santa Clara County > Santa Clara (0.04)
- (5 more...)
From Commands to Prompts: LLM-based Semantic File System for AIOS
Shi, Zeru, Mei, Kai, Su, Yongye, Zuo, Chaoji, Hua, Wenyue, Xu, Wujiang, Ren, Yujie, Liu, Zirui, Du, Mengnan, Deng, Dong, Zhang, Yongfeng
Large language models (LLMs) have demonstrated significant potential in the development of intelligent applications and systems such as LLM-based agents and agent operating systems (AIOS). However, when these applications and systems interact with the underlying file system, the file system still remains the traditional paradigm: reliant on manual navigation through precise commands. This paradigm poses a bottleneck to the usability of these systems as users are required to navigate complex folder hierarchies and remember cryptic file names. To address this limitation, we propose an LLM-based semantic file system ( LSFS ) for prompt-driven file management. Unlike conventional approaches, LSFS incorporates LLMs to enable users or agents to interact with files through natural language prompts, facilitating semantic file management. At the macro-level, we develop a comprehensive API set to achieve semantic file management functionalities, such as semantic file retrieval, file update monitoring and summarization, and semantic file rollback). At the micro-level, we store files by constructing semantic indexes for them, design and implement syscalls of different semantic operations (e.g., CRUD, group by, join) powered by vector database. Our experiments show that LSFS offers significant improvements over traditional file systems in terms of user convenience, the diversity of supported functions, and the accuracy and efficiency of file operations. Additionally, with the integration of LLM, our system enables more intelligent file management tasks, such as content summarization and version comparison, further enhancing its capabilities.
- North America > United States > New Jersey (0.04)
- North America > United States > Minnesota (0.04)
- North America > United States > California > Santa Cruz County > Santa Cruz (0.04)
- (4 more...)